Abstract

Assuming the validity of the Standard Model, or more generally that possible physics beyond
it would have only small effects on production cross sections, branching ratios and electroweak
radiative corrections, I determine the mass of the Higgs boson to be $M_H = 124.5 \pm 0.8$ GeV at the
68% CL. This is arrived at by combining electroweak precision data with the results of Higgs
boson searches at LEP 2, the Tevatron, and the LHC, as of December 2011. The statistical
interpretation of the method does not require a look-elsewhere effect correction. The method is
then applied to the data available at the time of the 2012 summer conferences. In this case, a
remarkable bell-shaped $M_H$ distribution is observed, and $M_H = 125.5 \pm 0.5$ GeV is extracted. For
neither experiment alone does the significance of the bulk (signal) region of the distribution
exceed five standard deviations, but the combination implies a $6.8\,\sigma$ effect.
I. INTRODUCTION



The LHC Collaborations ATLAS [1] and CMS [2] have presented preliminary combinations
of their Standard Model (SM) Higgs boson searches in data sets which correspond in
the most sensitive channels to integrated luminosities of 4.6 to 4.9 fb$^{-1}$ of $pp$ collisions at
$\sqrt{s} = 7$ TeV. In addition, the CDF and D0 Collaborations combined results on searches in
$p\bar{p}$ collisions at the Tevatron [3] at $\sqrt{s} = 1.96$ TeV, based on luminosities ranging from 4.0
to 8.6 fb$^{-1}$. In this brief communication, I analyze their findings simultaneously with earlier
results from LEP 2 [4] and with constraints from electroweak precision data. The goal is
to obtain the most likely values for the mass, $M_H$, of the SM Higgs boson by taking all
experimental information at face value and by explicitly accounting for any tensions or
inconsistencies between data and background hypotheses, data and signal hypotheses, as well
as (implicitly) between different data sets. The statistical interpretation is unambiguous
within Bayesian data analysis, the natural framework [5, 6] for parameter estimation. (For
an alternative approach, see Ref. [7].) Thus, I give an answer to the question: "Assuming the
approximate validity of the SM and allowing all experimental information, what is $M_H$?"

This article is organized as follows: Sec. II describes both the method and the data used
in this work. The results are presented in Sec. III, while Sec. IV gives conclusions and an
outlook. Finally, Appendix A incorporates significant experimental updates including first
results from the LHC operating at $\sqrt{s} = 8$ TeV.



II. DATA 

A. Data treatment 

The master equation of this analysis is

\[ p(M_H) = e^{-\chi^2_{\rm EW}(M_H)/2}\, Q_{\rm LEP}\, Q_{\rm Tevatron}\, Q_{\rm LHC}\, M_H^{-1}, \tag{1} \]

where the first factor is from the precision data, and $Q_{\rm LEP}(M_H)$ and $Q_{\rm Tevatron}(M_H)$ are the
ratios of the likelihood for the signal of a particular $M_H$ hypothesis plus the background
(H+B) to that of the background (B) alone [8]. Similarly, $Q_{\rm LHC} = Q_{\rm ATLAS}(M_H)\, Q_{\rm CMS}(M_H)$.
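Unfortunately, for the latest ATLAS and CMS data these quantities have not been made
publicly available. I construct $Q_{\rm ATLAS}$ and $Q_{\rm CMS}$ through the relation,

\[ -2 \ln Q = \chi^2_{H+B} - \chi^2_{B} = \frac{(\sigma_{\rm obs} - 1)^2}{\Delta\sigma_+^2} - \frac{\sigma_{\rm obs}^2}{\Delta\sigma_-^2}, \tag{2} \]
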
where $\sigma_{\rm obs}$ can be thought of as an effective observed cross section combining the various
channels considered by the LHC Collaborations. It is normalized to the corresponding
Higgs boson cross section at the reference $M_H$, i.e. $\sigma_s(M_H) = 1$. The errors are in general
asymmetric, with $\Delta\sigma_+$ ($\Delta\sigma_-$) pointing in the signal (background) direction. One expects
$\Delta\sigma_+ > \Delta\sigma_-$ from Poisson statistics, but cases with $\Delta\sigma_+ < \Delta\sigma_-$ also occur frequently. If
a fluctuation below the background is seen, $\sigma_{\rm obs} < 0$, then $\Delta\sigma_+$ is used in both terms in
Eq. (2), and conversely, whenever $\sigma_{\rm obs} > 1$ then only $\Delta\sigma_-$ enters. The factorized form (1) is
a reflection of the fact that mutual correlations between ATLAS and CMS, and between the
LHC and the Tevatron, are ignored. This is a justifiable approximation, since in the most
important regions counting rates are low and statistical uncertainties are therefore expected to
dominate.
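
The choice of uncertainty described above can be sketched in a few lines of Python. This is a minimal illustration, assuming that Eq. (2) has the form $-2\ln Q = \chi^2_{H+B} - \chi^2_B$ with the signal strength measured in units of the SM cross section (background at 0, signal at 1). The function name `minus_two_ln_q` and its argument names are hypothetical.

```python
def minus_two_ln_q(sigma_obs, d_plus, d_minus):
    """Return -2 ln Q = chi2_{H+B} - chi2_B at one mass hypothesis.

    sigma_obs : observed signal strength in units of the SM cross section
    d_plus    : uncertainty pointing in the signal (upward) direction
    d_minus   : uncertainty pointing in the background (downward) direction
    """
    # The background hypothesis sits at 0: use the error that points from
    # sigma_obs towards 0 (d_minus, unless sigma_obs is already below 0).
    err_b = d_minus if sigma_obs >= 0.0 else d_plus
    # The signal hypothesis sits at 1: use the error that points from
    # sigma_obs towards 1 (d_plus, unless sigma_obs is already above 1).
    err_hb = d_plus if sigma_obs <= 1.0 else d_minus
    chi2_b = (sigma_obs / err_b) ** 2
    chi2_hb = ((sigma_obs - 1.0) / err_hb) ** 2
    return chi2_hb - chi2_b


# Toy values (not taken from the experiments):
between = minus_two_ln_q(0.5, 0.5, 0.25)    # 0 < sigma_obs < 1
deficit = minus_two_ln_q(-0.5, 0.5, 0.25)   # fluctuation below background
excess = minus_two_ln_q(2.0, 0.5, 0.25)     # more signal than expected
```

Negative return values favour the H+B hypothesis, positive ones the background.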



Consider as an example the excess at 126 GeV seen by ATLAS (Sec. II B 1). In this case,
$\chi^2_B = 9.8$ and, since the H+B hypothesis itself is not perfectly matched either, $\chi^2_{H+B} = 1.1$, so that
Eq. (2) gives,

\[ -2 \ln Q_{\rm ATLAS}(126~{\rm GeV}) = -8.7 . \]

Note that $\sqrt{\chi^2_B} = 3.1$ is $0.5\,\sigma$ lower than the quoted local significance of $3.6\,\sigma$. This can be
traced to the signal cross section, which introduces an additional uncertainty (dominated by
$\sim \pm 20\%$ from the gluon-fusion production [9]). While it does not affect the p-value for an
excess over background, it does enter this analysis, which is based directly on a determination
of the signal strength$^1$, and thus works to reduce the significance. Conversely, the significance
for background-like outcomes is enhanced. Thus, $Q_{\rm LHC}$ tends to be on the conservative side in
the most interesting mass region, which compensates for the neglected correlations between
ATLAS and CMS.

1 Alternatively, one could compute $Q(M_H)$ directly from the p-values for the H+B and B hypotheses, but the
former have not been made available for CMS, and it is preferable to treat both LHC experiments in
identical ways.

FIG. 1. Combination of all direct SM Higgs boson search results (see text). Even at the $5\,\sigma$ level
of confidence (in the loose frequentist sense), there are only two remaining $M_H$ ranges (ignoring
another local minimum near 540 GeV), namely $115.6~{\rm GeV} < M_H < 128.1~{\rm GeV}$ and $M_H > 584$ GeV.

This completes the definition of the likelihood model used for this analysis. The last factor
in Eq. (1) is the (improper) non-informative prior density chosen such that the variable $\ln M_H$
has a flat prior, which one can argue is the most conservative (least informative) one for a
variable defined over the real numbers. The numerical significance of changing to a prior
which is flat in $M_H$ itself does not exceed the 0.1 GeV level in the determination of $M_H$.
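
How the factors of Eq. (1) combine on a mass grid, and how little the choice between the two priors matters for a narrow peak, can be illustrated as follows. This is only a sketch: the function `posterior`, the grid, and the Gaussian-shaped toy search result are assumptions made for illustration and are not the inputs of the analysis.

```python
import math

def posterior(masses, chi2_ew, minus_two_ln_q, log_flat_prior=True):
    """Normalized p(M_H) on a mass grid, mirroring the factors of Eq. (1).

    masses         : grid of Higgs mass hypotheses in GeV
    chi2_ew        : chi^2 of the precision data at each grid point
    minus_two_ln_q : one list of -2 ln Q values per experiment
    log_flat_prior : True -> prior 1/M_H (flat in ln M_H); False -> flat in M_H
    """
    weights = []
    for i, mass in enumerate(masses):
        # exp(-chi2_EW/2) * prod(Q) = exp(-(chi2_EW + sum(-2 ln Q)) / 2)
        total = chi2_ew[i] + sum(series[i] for series in minus_two_ln_q)
        prior = 1.0 / mass if log_flat_prior else 1.0
        weights.append(math.exp(-0.5 * total) * prior)
    norm = sum(weights)
    return [w / norm for w in weights]


# Toy input (hypothetical numbers, not the data of this analysis).
masses = [120.0 + 0.1 * i for i in range(101)]       # 120 ... 130 GeV
chi2_ew = [0.0 for _ in masses]                       # flat precision-data term
toy_search = [((m - 124.5) / 0.8) ** 2 - 13.0 for m in masses]

p_log = posterior(masses, chi2_ew, [toy_search], log_flat_prior=True)
p_flat = posterior(masses, chi2_ew, [toy_search], log_flat_prior=False)
mean_log = sum(m * p for m, p in zip(masses, p_log))
mean_flat = sum(m * p for m, p in zip(masses, p_flat))
```

For this toy peak the two means differ by far less than 0.1 GeV, because $1/M_H$ varies by under one percent across a width of 0.8 GeV.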



Before discussing the results for $p(M_H)$ in Section III, I summarize some of the individual
findings, and how they enter into this analysis (see also the recent historical account in
Ref. [10]).



B. Input Data 

1. ATLAS 



ATLAS excludes the Higgs boson mass ranges from 112.7 to 115.5 GeV, from 131 to
237 GeV, and from 251 to 453 GeV at the 95% CL. An excess of events is observed for a
Higgs boson mass close to $M_H = 126$ GeV. The maximum local significance of this excess is
$3.6\,\sigma$ above the expected background, while the probability of such a fluctuation to happen
anywhere in the full explored Higgs mass domain corresponds to a global significance of $2.3\,\sigma$.
The three most sensitive channels in this mass range, $H \to \gamma\gamma$, $H \to ZZ^{(*)} \to \ell^+\ell^-\ell^+\ell^-$,
and $H \to WW^{(*)} \to \ell^+\nu\,\ell^-\bar{\nu}$, contribute individual local significances of $2.8\,\sigma$, $2.1\,\sigma$, and
$1.4\,\sigma$, respectively, to the excess [1].
FIG. 2. Combination of all direct SM Higgs boson search results with the indirect precision data.
Compared to Fig. 1 only the low mass window remains.

There is also an excess number of $H \to ZZ$ candidates around $M_H = 244$ GeV and
towards the upper end of the search window (600 GeV). They are of lower significance but
are described better by the H+B hypothesis than by the background alone, given that

\[ -2 \ln Q_{\rm ATLAS}(244~{\rm GeV}) \approx -2 \ln Q_{\rm ATLAS}(560~{\rm GeV}) \approx -3 \]

are negative.



2. CMS 

Based on the $\gamma\gamma$, $b\bar{b}$, $\tau^+\tau^-$, $W^+W^-$, and $ZZ$ decay channels, CMS excludes the Higgs
mass range from 127 GeV to the upper end of the search interval of 600 GeV (95% CL). In
the remaining search interval between 110 and 127 GeV two excesses are observed: three
candidate $H \to ZZ^{(*)}$ events were reconstructed consistent with $M_H = 119.5$ GeV,
compared to 1.7 (0.7) expected events for the H+B (B) hypothesis. While this is corroborated
by an excess in $H \to WW^{(*)}$ and also in the less significant $b\bar{b}$ and $\tau^+\tau^-$ channels, the
more sensitive $H \to \gamma\gamma$ channel shows a deficit below background. When combined, Eq. (2)
yields,

\[ -2 \ln Q_{\rm CMS}(119.5~{\rm GeV}) = -5.6 . \]



On the other hand, there is an excess in $H \to \gamma\gamma$ corresponding to $M_H = 123.5$ GeV.
The signal strength is $1.7 \pm 0.8$ times the expected one, which amounts to a local significance
of $2.3\,\sigma$. Including the other channels, which are consistent with both the B and H+B
hypotheses, gives a local (global) significance of 2.6 (1.9) $\sigma$ [2]. When combined these
data match perfectly with $M_H = 124$ GeV ($\chi^2_{H+B} = 0$), and Eq. (2) gives,

\[ -2 \ln Q_{\rm CMS}(124~{\rm GeV}) = -6.6 , \]

where in this case the value of $\sqrt{\chi^2_B} = 2.6$ agrees exactly with the quoted local significance.

3. Tevatron 

The most recent combination from the Tevatron is based on 71 mutually exclusive final
states from CDF and 94 from D0. A small excess of data events is found in the mass range
between 125 GeV and 155 GeV with

\[ -2 \ln Q_{\rm Tevatron}(130~{\rm GeV}) = -1.9 , \]

while the region between 156 GeV and 177 GeV is excluded at the 95% CL [3].

4. LEP 2 

The input from LEP 2 is unchanged with respect to Ref. [8], which was the last analysis
of the type presented here before the LHC started data taking in earnest. At LEP 2 with
energies up to $\sqrt{s} \sim 209$ GeV, the Higgs boson was searched for in the dominant ($\sim 74\%$)
$b\bar{b}$ decay channel, produced in the Higgsstrahlung process, $e^+e^- \to ZH$. In addition, the
$H \to \tau^+\tau^-$ channel ($\sim 7\%$) was studied for the $Z$ boson decaying into two jets. The
combination [4] of the four experiments, all channels and all $\sqrt{s}$ values, resulted in the
nominal lower bound, $M_H > 114.4$ GeV. However, the combined data are neither particularly
compatible with the hypothesis $M_H = 115$ GeV (15% CL), nor with background only (9%
CL). The reason is that the results by ALEPH are by themselves in very good agreement
with $M_H \sim 114$ GeV (due to an excess in the 4-jet channel), thereby strongly rejecting the
background only hypothesis, while the results based on the other channels and experiments
(especially DELPHI) are incompatible with any signal. Overall, a signal for
$115~{\rm GeV} < M_H < 119.5~{\rm GeV}$ is favored by the data, but not with high significance,

\[ -2 \ln Q_{\rm LEP}(117~{\rm GeV}) = -1.7 . \]

The combination of all direct search results is illustrated in Fig. 1. Shown is the $\chi^2$
difference relative to the most signal-like Higgs mass of 125 GeV,

\[ \Delta\chi^2 \equiv -2 \ln \frac{p(M_H)}{p(125~{\rm GeV})} , \]

where $2 \ln p(125~{\rm GeV}) = 13.2$. This value is indicated by the red line in the figure and
corresponds to vanishing reach or else to cases where the overall search results are equally
well (or poorly) described by the H+B and B hypotheses.
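
The quantity plotted in Fig. 1 is a simple transformation of the gridded $p(M_H)$. As a minimal sketch with purely illustrative numbers (the function name `delta_chi2` and the toy values are assumptions, not taken from the analysis):

```python
import math

def delta_chi2(p_values, reference_index):
    """Delta chi^2 = -2 ln[p(M_H) / p(M_ref)] for every grid point."""
    p_ref = p_values[reference_index]
    return [-2.0 * math.log(p / p_ref) for p in p_values]


# Toy grid: p = 1 wherever the searches cannot distinguish H+B from B,
# and a single signal-like point with 2 ln p = 10 in the middle.
toy_p = [1.0, math.exp(5.0), 1.0]
curve = delta_chi2(toy_p, reference_index=1)
```

In this toy case the points without discriminating power sit at a constant offset of 10 above the reference point, which is the role played by the horizontal line in the figure.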



5. Precision Data 



The input electroweak precision data are dominated by the results of the Z-pole
experiments at LEP and the SLC [11] and correspond basically to those described in detail in
Ref. [12]. Despite a few discrepancies, the fit describes the data well, with $\chi^2/{\rm d.o.f.} =
45.6/42$. The probability of a larger $\chi^2/{\rm d.o.f.}$ is 33%. Only the muon magnetic moment
anomaly from BNL [13] and the $b$-quark forward-backward asymmetry from LEP 1 are
currently showing large ($3.0\,\sigma$ and $2.7\,\sigma$) deviations. In addition, the polarization asymmetry
from SLD differs by $1.7\,\sigma$. The effective $\nu$-quark coupling $g_L^2$ from NuTeV [15] is
nominally in conflict with the SM, as well, but the precise status is under investigation by the
Collaboration.
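
The quoted fit probability can be cross-checked without external libraries, because the $\chi^2$ survival function has a closed form for an even number of degrees of freedom. The helper below is an illustrative sketch and its name is hypothetical.

```python
import math

def chi2_survival_even_dof(chi2, dof):
    """P(X > chi2) for a chi-square variable with an even number of d.o.f.

    Uses the closed form exp(-x/2) * sum_{i=0}^{dof/2 - 1} (x/2)^i / i!.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form needs a positive even dof")
    half = chi2 / 2.0
    term = 1.0      # (x/2)^i / i! for i = 0
    total = 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


fit_probability = chi2_survival_even_dof(45.6, 42)
```

For $\chi^2 = 45.6$ with 42 degrees of freedom this evaluates to roughly one third, in line with the value given in the text.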

By themselves, the precision data give the $1\,\sigma$ result,

M H = 99+1 GeV > 

which covers exactly the low mass range not yet excluded by CMS. Compared to previous
analyses [5, 8] the indirect precision data now play a less pronounced role, as they do not have
much discriminatory power within the remaining low mass window, $115.5~{\rm GeV} < M_H <
127~{\rm GeV}$, since

\[ \chi^2_{\rm EW}(127~{\rm GeV}) - \chi^2_{\rm EW}(115.5~{\rm GeV}) = 0.63 . \]

However, they are the only source of information in the high mass region, $M_H \gtrsim 600$ GeV,
which is currently beyond the reach of the LHC. And they are crucial to guarantee a
normalizable posterior density.
FIG. 3. The normalized probability distribution of $M_H$ in the low mass region based on all data.
Shown in green (blue) is the 68% (98.2%) CL highest probability density region.

The combination of all direct search results with the indirect precision data is shown in
Fig. 2. Compared to Fig. 1 the high mass region is now also ruled out at the $8\,\sigma$ level.$^2$

III. RESULTS 



The main result of this communication is the normalized probability distribution of $M_H$
displayed in Fig. 3, which is based on all available data as summarized in Sec. II B (see
Fig. 7 below for the most recent data). Indicated in green is the 68% CL allowed highest
probability density region. It is given by the range, $123.7~{\rm GeV} < M_H < 125.3~{\rm GeV}$, which
I write in a more colloquial form as,

\[ M_H = 124.5 \pm 0.8~{\rm GeV}, \tag{3} \]

even though the central value is close to the minimum within this range rather than
representing the mode. Eq. (3) does contain, however, both modes at 124 and 125 GeV which
originate from CMS and ATLAS, respectively. Nominally (as reviewed in Sec. II B 1) the
latter would be expected to be closer to 126 GeV (see Fig. 1). However, CMS does not
see a signal there, and the peak is effectively cut, lowered, and shifted. On the other hand,



2 This is ignoring the fact that perturbation theory becomes unreliable for Higgs masses near the unitarity 
bound of about 800 GeV and beyond, so that the exclusion in those regions is rather of qualitative nature. 
FIG. 4. The normalized probability distribution of $M_H$ in the low mass region based on all data
except for CMS. Shown in green is the 68% CL highest probability density region, which is given
by M H = 126.4±?;| GeV. 

the CMS peak at 124 GeV (see Fig. 5) is perfectly consistent with ATLAS, so its peak
gets enhanced in the combination. The second CMS peak at 119.5 GeV, however, is clearly
disfavored by ATLAS.

Eq. (3) also contains the mean of the distribution given by $M_H = 124.4 \pm 1.0$ GeV, and
is almost identical to the 68% CL central interval, where the median happens to coincide
with the central value of Eq. (3).
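
A highest probability density region of a binned distribution can be found by collecting bins in order of decreasing probability until the requested level is reached. The sketch below uses a hypothetical two-peaked toy distribution (not the distribution of Fig. 3) to show how such a region can enclose two modes together with the dip between them; the name `hpd_bins` is an assumption.

```python
def hpd_bins(probs, level=0.68):
    """Indices of the highest-probability bins whose total reaches `level`."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen, total = [], 0.0
    for i in order:
        chosen.append(i)
        total += probs[i]
        if total >= level:
            break
    return sorted(chosen), total


# Toy twin-peak distribution on a 0.5 GeV grid (illustrative values only).
toy_masses = [123.0, 123.5, 124.0, 124.5, 125.0, 125.5, 126.0]
toy_probs = [0.05, 0.10, 0.25, 0.15, 0.30, 0.10, 0.05]

indices, content = hpd_bins(toy_probs, level=0.68)
region = (toy_masses[indices[0]], toy_masses[indices[-1]])
```

Here the region runs from 124.0 to 125.0 GeV, so its midpoint falls on the local minimum between the two modes, which is the situation described for Eq. (3).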

When taken together the twin peak structure in Fig. 3 contains the highest probability
density at the 98.2% CL corresponding to $2.4\,\sigma$. This is obtained by summing the
probability under each mass bin that is higher than the highest probability mass bin under the
subleading peak near 119.5 GeV. This may be set against the 3.6 and $2.6\,\sigma$ maximal local
significances for background fluctuations quoted by ATLAS and CMS, respectively, or to the
"de-rated" significances of 2.2 and $0.6\,\sigma$ after accounting for the so-called look elsewhere
effect (LEE) [16], which is applicable if the location for a hypothetical excess is a priori
unknown. The necessity for the LEE adjustment (i.e., accounting for trial factors) arises from
the frequentist set-up. ATLAS and CMS estimate their trial factors based on observed local
data fluctuations but the followed procedure [17] is bound to be somewhat arbitrary. Among
other things it depends on what one considers a priori excluded by previous experiments
FIG. 5. The normalized probability distribution of $M_H$ in the low mass region based on all data
except for ATLAS. Shown in green are the 68% CL highest probability density regions, which are
given by the two ranges, M H = 118.8±J;g GeV and M H = 123.9±{J GeV. 

or data sets. For example, if "elsewhere" is restricted to the low mass region up to about
145 GeV, the de-rated significances read 2.5 and $1.9\,\sigma$, respectively.
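
The bin-summing prescription for the bulk probability, and its translation into a number of standard deviations, can be sketched as follows. The function names and the toy bin contents are assumptions made for illustration; the conversion assumes the two-sided Gaussian convention, which reproduces the correspondence between 98.2% and about $2.4\,\sigma$.

```python
from statistics import NormalDist

def bulk_probability(probs, threshold):
    """Sum the probability of all bins that lie above `threshold`."""
    return sum(p for p in probs if p > threshold)

def two_sided_sigma(probability):
    """Gaussian standard deviations whose central interval holds `probability`."""
    tail = 1.0 - probability
    # Half of the tail probability lies on each side of the bulk.
    return -NormalDist().inv_cdf(tail / 2.0)


# Toy bins: a subleading peak of height 0.01 and a dominant structure.
toy_probs = [0.01, 0.004, 0.30, 0.40, 0.28, 0.006]
bulk = bulk_probability(toy_probs, threshold=0.01)

n_sigma = two_sided_sigma(0.982)
```

With the 98.2% quoted above, `two_sided_sigma` returns a value just below 2.4.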

It is amusing that in this way considerations of prior knowledge re-enter the frequentist
framework, which is sometimes chosen over the Bayesian one specifically to avoid prior
densities.$^3$ But as can be seen from the discussion in the previous paragraph, the dependence
on prior knowledge of the LEE is very strong in the case of CMS, and still 0.3 a for ATLAS pQ . 
In contrast, it is negligible for the results presented in this work. 



IV. CONCLUSIONS AND OUTLOOK 



I have collected all experimental information relevant to determine the Higgs boson mass,
and performed a simultaneous analysis. The result, $M_H = 124.5 \pm 0.8$ GeV, is remarkably
precise and, of course, driven in large parts by the LHC (see Fig. 6 for the probability density
when the LHC data are removed). Incidentally, $M_H$ is determined to slightly higher absolute
and slightly lower relative accuracy than the top quark mass [18], and if the SM is correct
3 Any reasonable probabilistic model is necessarily equivalent to a Bayesian model with some prior even 

though the choice of prior may be highly implicit. This is so, because the proof of Bayes' theorem needs 

no assumptions other than the axioms of probability theory. 
FIG. 6. The normalized probability distribution of $M_H$ in the low and intermediate mass regions
based on all data except for the LHC. Shown in green is the 68% CL highest probability density.



then the mass of the Higgs boson would have been measured accurately before its existence 
is indisputably confirmed. 

In addition to providing a well-defined determination of $M_H$, the Bayesian statistical
model employed here also establishes an unambiguous measure of significance. The highest
probability density under the twin peaks in Fig. 3 integrates to 98.2% corresponding to
$2.4\,\sigma$. Assuming the two LHC experiments see identical results with the next two or three
data sets of the same size, this would increase to $4.6\,\sigma$ or $5.4\,\sigma$, respectively. Thus, the
conventional $5\,\sigma$ should be reached roughly with an additional 12 fb$^{-1}$ per experiment of
data. This can be achieved in 2012 with a luminosity corresponding to a one third increase
relative to a good LHC week in October of 2011 (for example, by decreasing the effective
beam size, $\beta^*$, by about 25%).



Appendix A: Updated notes and figures 

While this article was under consideration for publication, both the ATLAS [19] and
CMS [20] Collaborations announced the observation of a new boson with mass, respectively,
given by $M = 126.0 \pm 0.4 \pm 0.4$ GeV and $M = 125.3 \pm 0.4 \pm 0.5$ GeV. A simple weighted
average would give $M = 125.7 \pm 0.4$ GeV. The datasets used correspond to integrated
FIG. 7. The normalized probability distribution of $M_H$ in the low mass region based on all data
available in the summer of 2012, with the 68% CL highest probability density region highlighted
(in green). Also shown are two reference Gaussians: (i) the dashed one (in black) is centered around
the median, $M_H = 125.50$ GeV, and has the same width as the 68% CL central probability interval,
$\pm 0.46$ GeV; (ii) the solid one (in red) is based on mean and variance, $M_H = 125.49 \pm 0.50$ GeV.

luminosities of up to approximately 5.1 fb$^{-1}$ collected at $\sqrt{s} = 7$ TeV in 2011 and up to
5.8 fb$^{-1}$ at $\sqrt{s} = 8$ TeV in 2012, and the local significances are quoted at 5.9 and $5.0\,\sigma$
for the two experiments, respectively. The CDF and D0 Collaborations [21] also released a
much improved analysis of their data, revealing an excess in the 115 to 140 GeV mass range
with a local significance of $3.0\,\sigma$.
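
The simple weighted average of the two announced masses can be reproduced with inverse-variance weights. The sketch assumes that statistical and systematic errors are added in quadrature and that the two measurements are uncorrelated; the function name is hypothetical.

```python
import math

def weighted_average(measurements):
    """Inverse-variance weighted mean of (value, stat, syst) triples."""
    weights = [1.0 / (stat ** 2 + syst ** 2) for _, stat, syst in measurements]
    total = sum(weights)
    mean = sum(w * value for w, (value, _, _) in zip(weights, measurements)) / total
    return mean, 1.0 / math.sqrt(total)


# ATLAS and CMS masses in GeV as quoted in the text.
mean, error = weighted_average([(126.0, 0.4, 0.4), (125.3, 0.4, 0.5)])
```

Rounded to one decimal place this gives 125.7 GeV with an uncertainty of 0.4 GeV.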

Here I update the results reflecting these developments. The 68% CL allowed highest
probability density range is now $125.02~{\rm GeV} < M_H < 125.95~{\rm GeV}$, or in short,

\[ M_H = 125.5 \pm 0.5~{\rm GeV}. \tag{A1} \]




In contrast to the situation described after Eq. (3), the latest data combine to a nearly bell-shaped
curve as shown in Fig. 7, with coinciding mean, median, and mode. The significance of the bulk
region of the probability distribution, $121.1~{\rm GeV} < M_H < 130.0~{\rm GeV}$ (according to the
method introduced here and illustrated in Fig. 8), is $6.8\,\sigma$, i.e., the tail regions contain a
probability of $9 \times 10^{-12}$. Given the greater effectiveness of the 8 TeV LHC for Higgs searches,
this is consistent with but exceeds somewhat the expectation expressed in Sec. IV.
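
The two kinds of reference Gaussian used in the figures, one built from the median and the central 68% interval and one from the mean and variance, can be obtained from a binned distribution as sketched below. The function name, the grid, and the exactly Gaussian toy input are assumptions for illustration; the quantiles are only resolved to the bin width.

```python
import math

def summarize(masses, probs):
    """Mean, standard deviation, median, and central 68% half-width."""
    mean = sum(m * p for m, p in zip(masses, probs))
    variance = sum((m - mean) ** 2 * p for m, p in zip(masses, probs))

    def quantile(q):
        # First grid point at which the cumulative probability reaches q.
        running = 0.0
        for m, p in zip(masses, probs):
            running += p
            if running >= q:
                return m
        return masses[-1]

    median = quantile(0.5)
    half_width = 0.5 * (quantile(0.84) - quantile(0.16))
    return mean, math.sqrt(variance), median, half_width


# Toy bell-shaped distribution centred at 125.5 GeV with width 0.5 GeV.
masses = [120.0 + 0.01 * i for i in range(1101)]
raw = [math.exp(-0.5 * ((m - 125.5) / 0.5) ** 2) for m in masses]
norm = sum(raw)
probs = [r / norm for r in raw]

mean, sigma, median, half_width = summarize(masses, probs)
```

For a distribution that is close to Gaussian the two descriptions agree, which is the sense in which mean, median, and mode are said to coincide.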
FIG. 8. Combination of all direct SM Higgs boson search results with the indirect precision data,
following the updates for the summer conferences of 2012. This is to be compared with Fig. 2,
but here the focus is on the low mass region. It is illustrated how the bulk (signal) region can be
defined unambiguously, permitting the extraction of the total signal probability. Using the inverse
error function, this can then be translated back into the number of standard deviations of the
signal without reference to the LEE.



Similarly, Fig. 9 (Fig. 10) shows the corresponding distribution for the combination of
the ATLAS (CMS) and LEP 2 Higgs search results with the electroweak precision data. The
significance of the ATLAS bulk region is $4.9\,\sigma$ and larger than the one from CMS ($4.2\,\sigma$),
but the CMS data are slightly more sharply peaked, as can be seen from the reference Gaussian
densities also shown in the figures. The probability that the true $M_H$ resides in one of the
tails is close to $10^{-6}$ for ATLAS and $3 \times 10^{-5}$ for CMS. These numbers are several orders of
magnitude larger than the background fluctuation probabilities (p-values) of $1.7 \times 10^{-9}$ and
$3 \times 10^{-7}$, respectively.



Finally, Fig. 11 shows the case for the combination of the CDF, D0 and LEP 2 Higgs
search results with the electroweak precision data, with a bulk region significance of $3.4\,\sigma$.
The tail probability of $6 \times 10^{-4}$ is in this case lower than the background p-value of $1.5 \times 10^{-3}$.
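
The tail probabilities and p-values compared above follow two different Gaussian conventions. As far as can be checked numerically, the tail probabilities correspond to the two-sided area outside the bulk, while the experimental p-values are one-sided. A short sketch (function names are hypothetical):

```python
import math

def two_sided_tail(n_sigma):
    """Probability outside +/- n_sigma of a Gaussian (both tails)."""
    return math.erfc(n_sigma / math.sqrt(2.0))

def one_sided_p_value(n_sigma):
    """Probability above +n_sigma of a Gaussian (upper tail only)."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


# Bulk significances quoted in the text -> two-sided tail probabilities.
tails = {s: two_sided_tail(s) for s in (6.8, 4.9, 4.2, 3.4)}
# Local significances quoted in the text -> one-sided p-values.
p_values = {s: one_sided_p_value(s) for s in (5.9, 5.0, 3.0)}
```

The results agree with the quoted numbers to within the rounding of the significances; for instance, $6.8\,\sigma$ gives about $10^{-11}$ and $5.9\,\sigma$ gives about $1.8 \times 10^{-9}$.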

[Plot for Fig. 9. Panel title: "ATLAS + LEP 2 + electroweak (ICHEP2012)". Horizontal axis: $M_H$ [GeV].]

FIG. 9. The normalized probability distribution of $M_H$ in the low mass region based on Higgs
search results from LEP 2 and ATLAS, as well as the electroweak precision data, with the 68% CL
highest probability density region highlighted (in green). Also shown are the median (in black) and
mean (in red) motivated reference Gaussian densities (cf. Fig. 7), with $M_H = 126.30 \pm 0.72$ GeV
and $M_H = 126.28 \pm 0.71$ GeV, respectively, which happen to be almost identical in this case.